Improving the performance of batch schedulers using online job runtime classification

Abstract

Job scheduling in high-performance computing platforms is a hard problem that involves uncertainties on both the job arrival process and their execution times. Users typically provide only loose upper bounds for job execution times, which are not so useful for scheduling heuristics based on processing times. Previous studies focused on applying regression techniques to obtain better execution time estimates, which worked reasonably well and improved scheduling metrics. However, these approaches require a long period of training data. In this work, we propose a simpler approach by classifying jobs as small or large and prioritizing the execution of small jobs over large ones. Indeed, small jobs are the most impacted by queuing delays, but they represent a light load and incur a small burden on the other jobs. The classifier operates online and learns by using data collected over the previous weeks, facilitating its deployment and enabling a fast adaptation to changes in the workload characteristics. We evaluate our approach using four scheduling policies on seven HPC platform traces. We show that: first, incorporating such classification reduces the average bounded slowdown of jobs in all scenarios, and second, in most considered scenarios, the improvements are comparable to the ideal hypothetical situation where the scheduler would know in advance the exact running times of jobs.
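
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' classifier: the 600-second threshold, the two-week window, the per-user majority vote, and all names (`Job`, `OnlineSmallLargeClassifier`, `order_queue`) are assumptions chosen for illustration. The `bounded_slowdown` helper uses the definition commonly found in the scheduling literature, with an assumed 10-second bound.

```python
from collections import deque
from dataclasses import dataclass

WEEK = 7 * 24 * 3600  # seconds


@dataclass
class Job:
    job_id: int
    user: str
    submit_time: float      # seconds
    requested_time: float   # user-provided upper bound, seconds
    runtime: float = 0.0    # actual runtime, known only after completion


class OnlineSmallLargeClassifier:
    """Hypothetical classifier: labels a job small or large from recent history."""

    def __init__(self, threshold=600.0, window=2 * WEEK):
        self.threshold = threshold  # assumed small/large cutoff, seconds
        self.window = window        # assumed length of the training window
        self.history = deque()      # (finish_time, user, was_small)

    def observe(self, job, finish_time):
        # Record a completed job; calls are assumed to arrive in finish-time order.
        self.history.append((finish_time, job.user, job.runtime <= self.threshold))

    def _expire(self, now):
        # Forget completions older than the window so the model tracks workload drift.
        while self.history and self.history[0][0] < now - self.window:
            self.history.popleft()

    def predict_small(self, job, now):
        self._expire(now)
        # The requested time is an upper bound, so a short request implies a small job.
        if job.requested_time <= self.threshold:
            return True
        votes = [small for _, user, small in self.history if user == job.user]
        if not votes:
            return False  # no recent history for this user: assume large
        return 2 * sum(votes) > len(votes)  # strict majority of recent jobs were small


def order_queue(queue, clf, now):
    # Predicted-small jobs go first; ties are broken by submission time (FCFS).
    return sorted(queue, key=lambda j: (not clf.predict_small(j, now), j.submit_time))


def bounded_slowdown(wait, runtime, tau=10.0):
    # Slowdown with runtimes below tau clamped, and a floor of 1.
    return max((wait + runtime) / max(runtime, tau), 1.0)
```

In this sketch, a scheduler would call `observe` whenever a job finishes and `order_queue` at each scheduling decision, so the classification only ever depends on completions inside the sliding window.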

Similar resources

Improving Parallel Job Scheduling Using Runtime Measurements

We investigate the use of runtime measurements to improve job scheduling on a parallel machine. Emphasis is on gang scheduling based strategies. With the information gathered at runtime, we define a task classification scheme based on fuzzy logic and Bayesian estimators. The resulting local task classification is used to provide better service to I/O bound and interactive jobs under gang schedulin...

Improving Mobile Grid Performance Using Fuzzy Job Replica Count Determiner

Grid computing is a term referring to the combination of computer resources from multiple administrative domains to reach a common computational platform. Mobile computing is a generic term for the use of movable, handheld devices with wireless communication for processing data. Mobile computing focuses on providing access to data, information, services and communications anywhere an...

Dimensionality Reduction and Improving the Performance of Automatic Modulation Classification using Genetic Programming (RESEARCH NOTE)

This paper shows how we can take advantage of genetic programming in the selection of suitable features for automatic modulation recognition. Automatic modulation recognition is one of the essential components of modern receivers. In this regard, selection of suitable features may significantly affect the performance of the process. Simulations were conducted with 5 dB and 10 dB SNRs. Test and ...

Improving classification performance using metaclasses

In this paper we propose a new methodology to improve the performance of classifiers on relatively difficult classification problems with complex boundaries between classes, overlapping classes, and a lack of sufficient number of samples for some classes. We investigate the use of contextual information to overcome such problems, especially in the case of class overlapping, high number of class...

Journal

Journal title: Journal of Parallel and Distributed Computing

Year: 2022

ISSN: 1096-0848, 0743-7315

DOI: https://doi.org/10.1016/j.jpdc.2022.01.003